Structural bioinformatics
Authors
Abstract
Motivation: Protein-protein complexes are known to play key roles in many cellular processes. However, they are often not accessible to experimental study because of their low stability and the difficulty of producing the proteins and assembling them in their native conformation. Docking algorithms have therefore been developed to provide an in silico approach to the problem. A protein-protein docking procedure traditionally consists of two successive tasks: a search algorithm generates a large number of candidate solutions, and a scoring function is then used to rank them. Results: To address the second step, we developed a scoring function based on a Voronoï tessellation of the protein three-dimensional structure. We showed that the Voronoï representation may be used to describe, in a simplified but useful manner, the geometric and physico-chemical complementarities of two molecular surfaces. We measured a set of parameters on native protein-protein complexes and on decoys, and used them as attributes in several statistical learning procedures: a logistic function, Support Vector Machines (SVM), and a genetic algorithm. For the latter, we used ROGER, a genetic algorithm designed to optimize the area under the ROC curve. To further test the scores derived with ROGER, we ranked models generated by two different docking algorithms on targets of a blind prediction experiment, improving the rank of native-like solutions in almost all cases. Availability: http://genomics.eu.org/spip/-Bioinformatics-tools-
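The ranking criterion mentioned in the abstract, the area under the ROC curve, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the linear form of the score, the names `linear_score` and `roc_auc`, the weights, and the toy attribute vectors are all hypothetical and are not taken from the paper or from ROGER. The sketch only shows how an AUC value measures how well a score separates native complexes from decoys.

```python
from typing import Sequence


def linear_score(attributes: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of interface attributes (hypothetical scoring form)."""
    return sum(a * w for a, w in zip(attributes, weights))


def roc_auc(native_scores: Sequence[float], decoy_scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the fraction of (native, decoy)
    pairs in which the native scores higher; ties count as half a win."""
    if not native_scores or not decoy_scores:
        raise ValueError("need at least one native and one decoy score")
    wins = 0.0
    for n in native_scores:
        for d in decoy_scores:
            if n > d:
                wins += 1.0
            elif n == d:
                wins += 0.5
    return wins / (len(native_scores) * len(decoy_scores))


# Toy attribute vectors and weights, invented for illustration only.
weights = [1.0, -0.5]
natives = [[2.0, 1.0], [3.0, 0.0]]
decoys = [[1.0, 2.0], [2.0, 1.0], [0.0, 0.0]]

native_scores = [linear_score(x, weights) for x in natives]
decoy_scores = [linear_score(x, weights) for x in decoys]
auc = roc_auc(native_scores, decoy_scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 1.0 would mean every native complex outscores every decoy, while 0.5 corresponds to a random ranking. An optimizer that targets the AUC would search over `weights` to push this value upward.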
Similar resources
Comparing various attributes of prolactin hormones in different species: application of bioinformatics tools
Prolactin is mainly secreted by the anterior pituitary and is able to stimulate mammary gland development and lactation in mammals. Although prolactins share a common ancestral gene, they show species-specific characteristics, and their efficiency may differ among mammals. The importance of protein structures of all sequences of this hormone has been studied by various bi...
Mining Biological Repetitive Sequences Using Support Vector Machines and Fuzzy SVM
Structural repetitive subsequences are the most important portion of biological sequences and play crucial roles in the corresponding sequence’s fold and functionality. The biggest class of repetitive subsequences is “Transposable Elements”, which has its own sub-classes depending on the contexts’ structures. Much research has been performed to critically determine the structure and function of repetitiv...
In-silico study to identify the pathogenic single nucleotide polymorphisms in the coding region of CDKN2A gene
Background: CDKN2A, encoding two important tumor suppressor proteins p16 and p14, is a tumor suppressor gene. Mutations in this gene and subsequently the defect in p16 and p14 proteins lead to the downregulation of RB1/p53 and cancer malignancy. To identify the structural and functional effects of mutations, various powerful bioinformatics tools are available. The aim of this study is the ident...
On distance and similarity in fold space
Metric information on similarities and distances in fold space is essential for quantitative work in structural bioinformatics and structural biology. Here we derive a suitable metric for protein structures from the fundamental axioms of similarity. Derivation of the metric also clarifies the relationship between the interrelated concepts of distance and similarity.
Preface to Introduction to Structural Bioinformatics
While many good textbooks are available on Protein Structure, Molecular Simulations, Thermodynamics and Bioinformatics methods in general, there is no good introductory-level book for the field of Structural Bioinformatics. This book aims to give an introduction to Structural Bioinformatics, which is where the previous topics meet to explore three-dimensional protein structures through comput...